ABSTRACT

A key challenge for the academic and biopharma- 
ceutical communities is the rapid and scalable pro- 
duction of recombinant proteins for supporting 
downstream applications ranging from therapeutic 
trials to structural genomics efforts. Here, we 
describe a novel system for the production of re- 
combinant mammalian proteins, including immune 
receptors, cytokines and antibodies, in a human 
cell line culture system, often requiring <3 weeks 
to achieve stable, high-level expression: Daedalus. 
The inclusion of minimized ubiquitous chromatin 
opening elements in the transduction vectors is 
key for preventing genomic silencing and maintain- 
ing the stability of decigram levels of expression. 
This system can bypass the tedious and 
time-consuming steps of conventional protein pro- 
duction methods by employing the secretion 
pathway of serum-free adapted human suspension 
cell lines, such as 293 Freestyle. Using optimized 
lentiviral vectors, yields of 20-100 mg/l of correctly 
folded and post-translationally modified, endotoxin- 
free protein of up to ~70kDa in size, can be 
achieved in conventional, small-scale (100 ml) 
culture. At these yields, most proteins can be 
purified using a single size-exclusion chromatog- 
raphy step, immediately appropriate for use in 
structural, biophysical or therapeutic applications. 



INTRODUCTION 

The ability to express milligram-to-gram quantities of 
highly purified, recombinant proteins has become essential 
to support cutting-edge therapeutic and structural 
genomics efforts. Available protein production platforms 
are commonly based on expression in bacterial, yeast, 
insect and, more recently, mammalian cell culture 
systems. Systems based on recombinant expression in
Escherichia coli, one of the oldest and most widely used
expression platforms, allow for the simple, rapid and
cost-effective production of target proteins. However, eu- 
karyotic proteins produced in bacteria often suffer from 
poor solubility, resulting in aggregation or misfolding, or 
lack proper post-translational modifications necessary for 
full biological activity (1,2). Yeast-based expression 
systems, particularly those employing Saccharomyces 
cerevisiae and Pichia pastoris, offer improvements over 
bacterial systems in terms of yield, the complexity of 
proteins successfully expressed and the ability to 
perform some (but not all) post-translational modifica- 
tions (3,4). Baculovirus-based or stably transformed 
insect cell systems using Drosophila (S2) or Spodoptera 
frugiperda (SF9) derived cell lines have become widely 
used for routine production of complex recombinant 
proteins. However, insect cell line-based platforms can 
be cumbersome to implement and do not correctly recap- 
itulate complex mammalian N-glycans containing galact- 
ose or sialic acid residues (5,6). 

For these reasons, most recombinant proteins for bio- 
medical use have been produced in mammalian cell lines, 
such as Chinese hamster ovary (CHO) or human embryonic
kidney (HEK) lines. Up to 70% of the recombinant
proteins produced commercially are made in CHO cells,
including many successful therapeutic biologics (e.g.
Etanercept, Trastuzumab and Rituximab) (7). However,
CHO-based expression systems may not be ideal for therapeutic
protein production due to the addition of terminal
galactose-α-1,3-galactose epitopes during N-glycosylation
of recombinant glycoproteins. This antigen can be responsible
for allergic hypersensitivities leading to adverse
clinical events such as anaphylaxis, as seen in the use of
the anti-cancer antibody Cetuximab (8). Another major
drawback of using mammalian cell line-based platforms
is the frequent need to select stable, high-expressing
clones, a process that is tedious, time consuming
(often taking months) and costly.

Purification of cytoplasmically targeted recombinant 
proteins requires cell lysis after culturing to high densities, 
an inherently piecemeal approach. However, stable trans- 
duction of recombinant proteins targeted to the secretion 
pathway generates producer lines that can be consistently 
maintained in culture, allowing the protein to be harvested 
periodically and purified directly from culture super- 
natants over a period of time. Secretion may also 
improve the expression of proteins that are toxic to the 
producer line when produced as cytoplasmically targeted 
constructs. Recent advances in developing cell lines
adapted to serum-free media synergize with this
approach by minimizing serum protein contaminants, dramatically
simplifying purification. Indeed, most
recombinantly expressed therapeutic proteins, such as 
cytokines and antibodies, are secreted proteins in their 
native state. 

Introduction of vectors encoding recombinant protein 
constructs for large-scale protein production in transient 
expression systems is often accomplished with costly and 
sometimes unreliable chemical transfection reagents. 
Alternatively, lentiviral vectors can be used to efficiently 
transduce cells yielding stable transductants (9,10). 
However, limitations of lentiviral vectors include con- 
strained packaging size (lentiviral particles are capable 
of packaging only about 10 kb of DNA efficiently) and
genomic silencing of integrated lentiviral transgenes, seen 
both in vitro and in vivo (11,12). Ubiquitous chromatin 
opening elements (UCOEs) have shown promise in main- 
taining high levels of protein expression over extended 
periods of time in in vitro transfection settings (13,14). 
UCOEs have also been incorporated into lentiviral 
vectors in order to prolong the expression of integrated 
lentiviral transgenes in gene therapy settings (15,16). The 
smallest element tested in these settings was a 2.0-kb 
fragment of the HNRPA2B1/CBX3 locus (UCOE2.0); 
however, given the size constraints of lentiviral packaging, 
the use of this fragment significantly limits the size of re- 
combinant protein constructs that can be expressed. 

Initially to address our own particular need to express a 
natively glycosylated mammalian protein (Siderocalin) for 
crystallographic studies, but also to more generally 
address the limitations of current technologies, we report 
the development of a generalizable protein expression 
platform using lentiviral transduction of serum-free
adapted human 293 Freestyle (293-F) cells. This system,
designated 'Daedalus', is capable of producing large 
quantities of readily purifiable, secreted recombinant 
protein in a rapid manner. To enhance this system, we 
have engineered a novel 0.7-kb UCOE fragment 
(UCOE0.7) that, when incorporated into our lentiviral 
vector, leads to the stable and enhanced expression of re- 
combinant proteins of sizes approaching 70kDa. In 
addition, we utilize a cis-linked fluorescent reporter
(GFP) driven by an internal ribosome entry site (IRES)
that allows rapid detection of transduced populations
and tracking of relative protein expression levels, and obviates
the time-consuming steps required for drug selection and isolation
of high-expressing clones. Of the 14 proteins tested
with the Daedalus system, 12 were successfully expressed
at levels between 20 and 100 mg/l in conventional
100-ml-scale cultures and readily purified via a single chromatography
step for use in multiple applications.
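
The size argument above can be made concrete with some rough arithmetic. The sketch below is illustrative only: the ~110 Da average residue mass is a common rule of thumb rather than a value from this work, and the function names are hypothetical. It estimates the coding length needed for a ~70 kDa protein and the packaging space freed by using UCOE0.7 in place of UCOE2.0.

```python
# Rule-of-thumb average mass of one amino acid residue.
# This is an assumption for illustration, not a value reported in the paper.
AVG_RESIDUE_DA = 110.0


def coding_length_bp(mass_kda: float) -> int:
    """Approximate ORF length in bp (including one stop codon) for a protein mass."""
    residues = round(mass_kda * 1000.0 / AVG_RESIDUE_DA)
    return 3 * (residues + 1)


def freed_capacity_bp(old_element_kb: float, new_element_kb: float) -> int:
    """Packaging space (bp) gained by swapping a larger element for a smaller one."""
    return round((old_element_kb - new_element_kb) * 1000)


orf_bp = coding_length_bp(70.0)          # ORF for a ~70 kDa protein
freed_bp = freed_capacity_bp(2.0, 0.7)   # UCOE2.0 -> UCOE0.7

print(f"~70 kDa protein: about {orf_bp} bp of coding sequence")
print(f"UCOE2.0 -> UCOE0.7 frees about {freed_bp} bp of packaging space")
```

Under these assumptions, the 1.3 kb saved by the minimized element is comparable in size to the coding sequence of a mid-sized protein, which is why the shorter fragment matters within a packaging limit of about 10 kb.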

MATERIALS AND METHODS 

Cell culture 

293 Freestyle cells (Invitrogen) were grown in Freestyle
293 Expression media (Gibco) with shaking at 130 rpm,
at 37°C and 8% CO2 in vented 125-ml shake flasks
(Nalgene). CHO-S cells (Invitrogen) were grown in CHO 
Expression media (Gibco) under the same conditions. 

Lentiviral vector constructs 

Lentiviral vector, pCVL-A (Figure 1A), was constructed 
from pRRL-SIN-cPPT-PGK-GFP-WPRE (Addgene# 
12252) by replacing the WPRE element with a WPRE-O 
fragment (17), inserting two copies of a polyadenylation 
(poly A) enhancer (18) into the 3' SIN LTR, and replacing 
the native U5-polyA signal with the bovine growth 
hormone (BGH) polyA signal sequence. The PGK 
promoter was replaced with the spleen focus forming 
virus (SFFV) promoter and the GFP sequence was 
replaced with an IRES-eGFP sequence. The murine 
Siderocalin sequence insert was isolated by PCR amplifi- 
cation of the expression construct used in E. coli with 
primers including Xho I and Bam HI sites plus a 
C-terminal Strep-tag II (SAWSHPQFEK) purification 
tag (19) and subcloned into the unique Xho I and Bam 
HI restriction sites in the pCVL-A vector; the resulting 
construct was designated pCVL-SFFV-muScn- 
IRES-GFP. The UCOE fragment within the human 
HNRPA2B1/CBX3 housekeeping gene locus (15) was 
amplified from Nalm6 genomic DNA [using primers:
UCOE5'Nhe I — GCTAGCggatcctggtacctaaaacagc and 
UCOE3'Sal I — GTCGACagtcgcttcagcccg (anti-sense)], 
TA-cloned and verified by sequencing and restriction 
mapping. The 2.0-kb Acc 65/Xba I fragment of the 
UCOE construct was blunted using Klenow (NEB) and 
ligated into Mlu I digested/blunted pCVL-SFFV-muScn- 
IRES-GFP upstream of the SFFV promoter. The forward 
orientation of the UCOE was confirmed by restriction 
mapping; the derived construct was designated 
pCVL-UCOE2.0-SFFV-muScn-IRES-GFP. A UCOE0.7 
fragment (containing the CBX3 promoter of UCOE2.0) 
Figure 1. Maps of the lentiviral vectors used. Schematics are shown of
(A) lentiviral construct pCVL-A, (B) non-UCOE containing construct 
pCVL-SFFV-muScn-IRES-GFP and (C) minimized UCOE containing 
construct pCVL-UCOE0.7-SFFV-muScn-IRES-GFP. SP, signal 
peptide; UCOE, ubiquitous chromatin opening element; SFFV, spleen 
focus-forming virus enhancer/promoter; muScn, murine Siderocalin; 
IRES, internal ribosome entry site. 



was constructed by digesting the UCOE2.0 fragment with 
Mlu I (internal) and Eco RI; the SFFV promoter was
digested with Eco RI and Xho I and both fragments 
were cloned into the pCVL-SFFV-muScn-IRES-GFP 
vector (digested with Mlu I and Xho I) via a three-part 
ligation reaction. The final construct was designated 
pCVL-UCOE0.7-SFFV-muScn-IRES-GFP. All subse- 
quent candidate protein coding sequences were either 
amplified with PCR primers (available upon request) con- 
taining the required N-terminal signal peptide and a 
C-terminal STREP-tag II sequence, or gene synthesized 
(with codon utilization optimized for human expression) 
and cloned into pCVL-UCOE0.7-SFFV-muScn-IRES- 
GFP using the unique Xho I and Bam HI restriction 
sites, replacing the muScn sequence. All restriction 
enzymes were FastDigest, purchased from Fermentas. 
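
As a quick sanity check on the UCOE primers listed above, the sketch below scans each primer for the recognition sequences of the enzymes named in this section. The recognition sequences are the standard ones for these enzymes; the dictionary and function names are hypothetical, and because all four sites are palindromic only one strand is searched.

```python
# Standard recognition sequences for enzymes mentioned in the cloning strategy.
SITES = {
    "NheI": "GCTAGC",
    "SalI": "GTCGAC",
    "BamHI": "GGATCC",
    "Acc65I": "GGTACC",
}

# Primer sequences as given in the text (upper case marks the added 5' site).
PRIMERS = {
    "UCOE5'Nhe I": "GCTAGCggatcctggtacctaaaacagc",
    "UCOE3'Sal I": "GTCGACagtcgcttcagcccg",
}


def find_sites(seq: str) -> dict:
    """Return {enzyme: 0-based position of first match} for sites found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        pos = seq.find(site)
        if pos != -1:
            hits[enzyme] = pos
    return hits


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

Besides the 5' Nhe I tail, the forward primer also carries Bam HI and Acc65 I sequences in its annealing region, which appears consistent with the later use of an Acc 65/Xba I fragment.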

Lentivirus production, concentration, titration and 
transduction 

Lentivirus was produced by transient transfection of 293T 
(ATCC) cells using linear 25-kDa polyethyleneimine (PEI; 
Polysciences). Briefly, 4 × 10^6 cells were plated onto 10-cm
tissue culture plates. After 24 h, 3 µg of psPAX2, 1.5 µg of
pMD2G (Addgene plasmid #12260 and #12259, respectively)
and 6 µg of lentiviral vector plasmid were mixed in
500 µl diluent (5 mM HEPES, 150 mM NaCl, pH = 7.05)
and 42 µl of PEI (1 mg/ml) and incubated for 15 min. The
DNA/PEI complex was then added to the plate drop-wise. 
Lentivirus was harvested 48 h post-transfection,
concentrated 100-fold by low-speed centrifugation at
8000g for 18 h and titered on human Nalm6 cells by
flow cytometry and qPCR as previously described (20).
Transduction of the target cell line was carried out in
six-well plates containing 2 × 10^6 cells per well in 2 ml of
growth media and 4 µg/ml hexadimethrine bromide
(Polybrene; SIGMA). Titered virus was added to each
well at the designated multiplicity of infection (MOI;
four wells per condition) and the cells were incubated
with shaking (130 rpm) at 37°C, in 8% CO2 for 4-6 h.
The cells were then harvested, pooling the replicate 
wells, and pelleted at low speed (1000g). Transduction 
media was removed and replaced with 20 ml fresh media 
and cells were transferred to a 125-ml vented shake flask 
(Nalgene). Copy number was estimated using a previously 
established genomic Q-PCR-based assay with comparison 
to a housekeeping gene as well as control cell lines with a 
defined viral integration number based on Southern 
blotting (21). 
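
The quantities in this protocol imply two simple calculations, sketched below. The plasmid masses, PEI volume and cells per well are taken from the text; the viral titer (1 × 10^9 transducing units per ml) is a purely hypothetical placeholder, since no titer is reported here, and the function names are likewise illustrative.

```python
def pei_to_dna_ratio(dna_ug, pei_ul, pei_mg_per_ml=1.0):
    """Mass ratio of PEI to total DNA (1 mg/ml is the same as 1 µg/µl)."""
    pei_ug = pei_ul * pei_mg_per_ml
    return pei_ug / sum(dna_ug)


def virus_volume_ul(cells, moi, titer_tu_per_ml):
    """Volume of virus stock (µl) needed to reach a given MOI."""
    return cells * moi / titer_tu_per_ml * 1000.0


# psPAX2, pMD2G and vector plasmid masses (µg) with 42 µl of 1 mg/ml PEI.
ratio = pei_to_dna_ratio([3.0, 1.5, 6.0], pei_ul=42.0)

# 2 x 10^6 cells per well at MOI 10; the titer is a HYPOTHETICAL example value.
volume = virus_volume_ul(cells=2e6, moi=10, titer_tu_per_ml=1e9)

print(f"PEI:DNA mass ratio = {ratio:.1f}:1")
print(f"Virus per well at MOI 10 = {volume:.1f} µl (assuming 1e9 TU/ml)")
```

With the stated amounts, the transfection mix works out to a 4:1 PEI:DNA mass ratio; the virus volume scales inversely with whatever titer is actually measured.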

Protein expression, purification and analysis 

Transduced 293-F cells (at least 1 week post-transduction) 
were seeded at 5 × 10^5 cells/ml in a 1-l vented shaker flask
(Nalgene) in 100 ml of 293 Expression media (Gibco).
Twenty-five milliliters of fresh media was added 2-3
days later, when cells reached densities of 2-3 × 10^6
cells/ml. The media was harvested after 5 days of total
incubation after measuring final cell concentration and 
viability. Culture supernatants were harvested by 
low-speed centrifugation to remove cells and filtered 
through a 0.22-micron Centricon ultrafilter (Millipore). 
NaCl was added to a final concentration of 250 mM and 
the supernatants were concentrated to final volumes of 
~5ml using a Vivacell-100 centrifugal concentrator 
(Sartorius Stedim). Recombinant proteins were separated 
from media by size-exclusion chromatography (SEC) on a 
Superdex 75 column (GE Healthsciences). Proteins were 
transferred to PVDF-FL (Millipore) membranes for 
western blot analysis with mouse anti-STREP primary 
(IBA) and an anti-mouse-Alexa 680 (Molecular Probes) 
secondary antibody, with results visualized using 
Li-COR fluorescent detection system (Odyssey). 
Endotoxin levels were measured by the Pyrogene endo- 
toxin detection system (Lonza) following the manufactur- 
er's protocols. 

Crystallization and structure determination 

Crystals of muScn were obtained by hanging drop vapor 
diffusion at 25°C. The protein was concentrated to 12 mg/ 
ml in standard phosphate-buffered saline (PBS) and mixed 
1:1 v/v with a reservoir solution of 0.1 M HEPES (pH 7.5), 
10% v/v isopropanol and 20% w/w PEG 4000. Crystals 
grew overnight and were cryoprotected in mother liquor 
containing 10% v/v glycerol. Diffraction data were col- 
lected at the Advanced Light Source 5.0.1 beamline 
(Lawrence Berkeley National Laboratory). Data were 
reduced with d*TREK (22). Initial structure factor 
phase information was determined by molecular replace- 
ment, using the program PHASER (23) as a part of the 
CCP4i program suite (24). The structure of human 
Siderocalin (PDB code: 1X89, chain A) was used as a
search model. Model building and refinement were 
carried out using COOT (25) and refmac5 (26). The struc- 
ture was submitted to the TLSMD server (27,28) and TLS 
refinement using a single group was applied to the final 
model. Structure validation was carried out with the 
MolProbity server (29) and the RCSB ADIT server. 
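
For reference, the 1:1 v/v mixing described for the hanging drops halves each component at the moment of setup. The short sketch below computes these starting concentrations; it is a simplified illustration that ignores the PBS components carried in with the protein and the subsequent equilibration against the reservoir, and the function name is hypothetical.

```python
def after_mixing(stock, parts_of_this=1.0, parts_of_other=1.0):
    """Concentration of a component right after mixing two solutions by volume."""
    return stock * parts_of_this / (parts_of_this + parts_of_other)


# Stock values taken from the text; units are carried through unchanged.
initial_drop = {
    "protein (mg/ml)": after_mixing(12.0),
    "HEPES pH 7.5 (M)": after_mixing(0.1),
    "isopropanol (% v/v)": after_mixing(10.0),
    "PEG 4000 (% w/w)": after_mixing(20.0),
}

for component, value in initial_drop.items():
    print(f"{component}: {value:g}")
```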

Functional assays 

Bone marrow was harvested from C57BL/6 mice (Jackson 
laboratories) and plated in a six-well dish at 1 × 10^6
cells/ml in 2 ml of lymphocyte media (RPMI 1640
(Hyclone) supplemented with 10% FCS (Omega
Scientific), 5 mM L-Alanyl-L-Glutamine (Mediatech),
100 IU/ml penicillin-streptomycin (Mediatech) and
2 mM β-mercaptoethanol (Gibco)) with 5 ng/ml of
Daedalus IL-7 or commercial IL-7 (PeproTech). Media 
with IL-7 was replaced at Day 3 and the resulting cell 
populations were evaluated by fluorescence-activated cell 
sorting (FACS) at Day 5. Cells were stained with fluores- 
cein isothiocyanate (FITC) anti-mouse IgM (Jackson 
Immuno Research) and PE anti-mouse B220 (BD) in 
FACS staining buffer (PBS with 2.5% FCS) and 
analyzed on an LSRII flow cytometer (BD) and using 
FlowJo software (Tree Star, Ashland, OR, USA). 
Recombinant human LIF was tested for its ability to 
maintain pluripotency of murine ES cells. Briefly, 
murine ES cells (AB1 ES cells kindly provided by Steve 
Jones, University of Massachusetts) were plated on 
irradiated feeder cells with either commercial human 
LIF (Millipore) or Daedalus recombinant human LIF at 
10 ng/ml. Cells were split every 2 days with fresh media
and stained with APC anti-SSEA1 (BD) for analysis by
flow cytometry. 

RESULTS 

Expression of murine Siderocalin using lentiviral 
transduction of 293 Freestyle cells 

The coding sequence of murine Siderocalin (muScn) was 
cloned into the optimized lentiviral backbone pCVL-A 
(Figure 1A) resulting in the construct pCVL-SFFV-
muScn-IRES-GFP (Figure 1B). The ability of this
lentiviral vector to transduce 293-F cells was compared 
to the more commonly used CHO-S cell line. Compared 
to CHO-S cells, the 293-F cells were more receptive to 
viral transduction, at almost all multiplicities of infection 
(MOI; 0.25-10), as measured by the higher mean fluores- 
cence intensity (MFI) of the GFP reporter. 293-F cells 
were also more resistant to virus-mediated toxicity at the 
highest MOI (10; Supplementary Figure S1).

293-F cells were transduced with the muScn lentivirus at 
three different MOIs (Figure 2A) and GFP levels were 
measured by flow cytometry at 1 week post-transduction. 
Essentially 100% of the cells were GFP positive 
(Figure 2A) and muScn secretion was detectable even by 
Coomassie blue staining of sodium dodecyl sulfate-poly- 
acrylamide gel electrophoresis (SDS-PAGE) analyses of 
unconcentrated, 20-µl samples of media supernatants
as soon as 7 days post-transduction (Figure 2B). 



The identity of the protein was confirmed by western 
blot using an anti-STREP antibody (Figure 2B). The 
IRES-GFP reporter proved to be a good surrogate for 
protein expression as the increase in protein levels 
correlated well with the increase in GFP levels at all 
three MOIs. Effective protein yield in a 125-ml culture 
was ~13mg/l of fully glycosylated muScn; proper 
folding was confirmed by comparative reduced/ 
non-reduced SDS-PAGE and SEC. 
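
The observation that essentially all cells were GFP positive at high MOI is what a simple Poisson model of transduction would predict. The sketch below is a textbook idealization, not an analysis performed in this work: it assumes that integration events are independent and that the titer reflects functional units on the target cells, neither of which holds exactly in practice.

```python
import math


def fraction_transduced(moi: float) -> float:
    """Expected fraction of cells with at least one integration (Poisson model)."""
    return 1.0 - math.exp(-moi)


# MOI values spanning the range tested against CHO-S cells (0.25-10).
for moi in (0.25, 1.0, 10.0):
    print(f"MOI {moi:>5}: {100 * fraction_transduced(moi):.2f}% of cells transduced")
```

Under this model, an MOI of 1 leaves roughly a third of cells untransduced, whereas at an MOI of 10 the untransduced fraction is negligible, in line with a population that needs no sorting or selection.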

Use of a novel, minimized UCOE to stabilize high-level 
expression of muScn 

Consistent with previous observations of silencing of lenti- 
virus integrants in vitro and in vivo (11,12), long-term 
culture (up to 1 month) of the transduced cells resulted 
in a steady decline in GFP MFI and corresponding 
protein levels (Figure 3A). In order to enhance protein 
production and stably maintain expression levels for 
extended periods, we incorporated two alternative 
UCOE elements derived from the CpG island within the 
human HNRPA2B1/CBX3 locus, either UCOE2.0 or 
UCOE0.7. Expression levels from vectors incorporating 
either element were comparable by initial GFP MFI 
signal and were stable over extended periods of up to 1 
month (Supplementary Figure S2). However, expression 
levels, as measured by GFP MFI, from transductants
incorporating the truncated UCOE0.7 were substantially
enhanced relative to transductants lacking UCOEs (at
matched copy numbers of ~8-10 copies) and were stable
over at least 1 month, consistent with resistance to
silencing (Figure 3B). The loss of GFP in the
transductants lacking the UCOE element was not due to
loss of the integrated gene as the viral copy number did
not vary significantly over the 1-month period (data not
shown). It is possible to further enhance and prolong 
protein expression by sorting for the highest GFP 
positive (GFP+) cells (Supplementary Figure S3). This
additional step, however, was not required and was 
omitted for testing the expression/purification of the 
multiple subsequent proteins reported here. 

Expression, purification and crystallization of 
293-expressed muScn 

The goal of this effort was to produce large quantities of 
muScn to support crystallographic studies, successful 
completion of which would confirm the proper folding 
of the recombinant protein. At MOIs of 10-20, 
pCVL-UCOE0.7-SFFV-muScn-IRES-GFP efficiently 
transduced 293-F cells leading to a 100% GFP+ population,
obviating any requirement for single-cell cloning or
sorting. A 200-ml culture incubated for 5 days yielded
~6 mg of protein (~37 mg/l) after purification by preparative
SEC (Table 1 and Figure 4A). This yield was almost
3-fold higher than the yield obtained using the lentiviral
vector lacking UCOE. The protein obtained was
glycosylated and >99% pure (Figure 4A, inset). Final
culture density in this experiment was only about
3-4 × 10^6 cells per ml, compared to typical endpoint
densities of 8-10 × 10^6 cells per ml for CHO cell culture,
highlighting the efficiency of recombinant protein 
Figure 2. Transduction of 293-F cells at varying multiplicity of infection (MOI). (A) GFP MFI (mean fluorescence intensity) is shown for three
different doses of pCVL-SFFV-muScn-IRES-GFP virus. (B) Analyses of expression levels of muScn are shown; 20-µl aliquots from media
supernatants from each transduction were separated by SDS-PAGE and stained with Coomassie blue (left) or analyzed by western blot with an
anti-Strep II tag antibody (right).
Figure 3. Comparison of non-UCOE vs. UCOE0.7-containing constructs expressing murine Siderocalin. (A) GFP MFI measured over a 1-month
period for pCVL-SFFV-muScn-IRES-GFP. (B) GFP MFI, for the same period of time, for pCVL-UCOE0.7-SFFV-muScn-IRES-GFP. Cells were
matched for copy number and data are representative of three replicate experiments.



expression in this 293-F-based system. The fully 
glycosylated, 293-F-expressed muScn was readily crystallizable
(Figure 4B), whereas non-glycosylated, recombinant
versions of muScn had previously been recalcitrant to
crystallization. The crystal structure (d_min = 1.8 Å) of
293-F-expressed muScn showed that the protein was
properly folded, with expected N-linked glycosylation
and intrachain disulfide bond formation readily apparent
(Table 2 and Figure 4C). The time required to go from
viral transduction to structure determination in this 
example was only 18 days. Even though it had been suc- 
cessfully crystallized with protein recombinantly expressed 
from baculovirus and bacterial systems, human 
Siderocalin (huScn) was also expressed (at a matched 
copy number) using the same protocols for comparison. 
The effective yield of huScn after purification and 
a Residues for constructs with a non-native signal peptide correspond to the sequence that was cloned downstream of the heterologous signal peptide.
b Molecular weights are listed according to the estimated molecular weight of the mature protein, lacking the signal peptide and any modifications. 
c Canine G-CSF contains a single O-linked glycosylation site. 



[Figure 4A panel labels: SEC trace with muScn and contaminant peaks; x-axis, Retention Volume (mL). Inset gel lanes: +PNGase and -PNGase, showing glycosylated and deglycosylated bands.]




Figure 4. Purification and crystallization of murine Siderocalin. (A) Preparative Superdex 75 SEC trace for muScn expressed using the optimized 
lentiviral construct pCVL-UCOE0.7-SFFV-muScn-IRES-GFP. SDS-PAGE analysis of the peak fraction, before and after PNGase digestion, is 
shown (inset). (B) Crystals obtained using the PEGs II Suite (Qiagen) sparse matrix screen, condition 63 [0.1 M HEPES (pH 7.5), 10% v/v 
isopropanol and 20% w/w PEG 4000]. (C) Crystal structure of muScn (red) superimposed on that of huScn (1X89 in black). Electron density 
from a refined F_o - F_c omit map contoured at 2.0 σ showing the presence of the N-linked glycan and the intrachain disulfide bond.
Table 2. Data collection and refinement statistics (molecular
replacement)

Data collection
  Space group                   P2₁2₁2₁
  Lattice constants (Å)         a = 42.89, b = 43.96, c = 101.92
  Resolution (Å)                42.89-1.8 (1.86-1.80)
  Unique reflections            18 574
  Average redundancy            6.87 (7.09)
  Completeness (%)              99.9 (100.0)
  R_merge (%)                   4.6 (39.4)
  I/σ(I)                        14.7 (3.5)

Refinement
  R_work (%)                    20.08
  R_free (%)                    22.61
  Number of atoms
    Protein                     1341
    Ligand                      50
    Water                       89
  RMSD from ideal values
    Bond lengths (Å)            0.015
    Bond angles (°)             1.50
    Chiral volume (Å^3)         0.07
  Ramachandran
    Most favored (%)            92.6
    Additionally allowed (%)    6.1
    Generously allowed (%)      0.0
    Disallowed (%)              1.4

concentration was ~47mg/l under similar culture condi- 
tions, a modest increase over muScn (Table 1). 

Expression of other secreted proteins 

In order to assess the general applicability of the Daedalus 
system, a panel of secreted proteins was then tested for 
expressibility: two additional human lipocalins, Lipocalin
15 (Lcn15) and glycodelin (Gd); the soluble ectodomain of
the human major histocompatibility complex class I
homolog MICA; the isolated, immunoglobulin-like α3
domain of MICA (MICAα3; Thr81-Ser274); and two
antibody single-chain Fv (scFv) constructs, derived from 
either the anti-HIV antibody 4E10 or the anti-canine 
CD28 antibody 5B6. All proteins were analyzed by analytical
SEC (Supplementary Figure S5A-S5C) and SDS-PAGE
under reducing and non-reducing conditions
(Supplementary Figure S5D); yields were calculated 
using the BCA protein assay (Table 1). 

Lcn15 has, as yet, no known physiological function;
periplasmic expression of Lcn15 in E. coli had previously
produced only modest yields of ~1 mg of purified protein
per liter of culture. Gd can be isolated in multiple, functionally
relevant glycoforms (GdA, GdC, GdF and GdS)
from the female reproductive tract and seminal plasma
and primarily modulates sperm function (30,31). It had
been shown previously that HEK293 cells are capable of
producing functional GdA (32). Lcn15 and Gd were
cloned into the UCOE0.7-containing vector and expressed
and purified following the same methodology used for
Scn, resulting in yields of 94 and 103 mg/l, respectively, a
roughly 100-fold improvement in Lcn15 yield over periplasmic
bacterial expression.

MICA is a conditionally expressed self-antigen 
recognized by the widely expressed, activating 
immunoreceptor NKG2D (33). MICA, solubilized by
truncation prior to the transmembrane-spanning
segment at residue 274, had previously been produced as
a secreted protein in a baculovirus expression system (34),
yielding up to 7 mg/l after harvesting 5 days post infection,
or by refolding denatured bacterial inclusion bodies
in vitro (35), an arduous and expensive (typically >$900
per milligram of purified protein) process that requires
considerable experimental finesse. Yields of purified
MICA range up to 1.1 mg/l of input bacterial culture by
in vitro refolding. Refolded MICA, lacking N-linked
oligosaccharides, tended to aggregate over time.
Daedalus expression of MICA yielded up to 82 mg/l of
purified, natively glycosylated protein that was stable in
solution over at least 3 months following purification, a
74-fold improvement over the bacterial system. MICAα3,
a species that expressed poorly by every other technique 
tried, was targeted to the secretion pathway by fusion with 
a heterologous huScn signal peptide for testing in the 
Daedalus system, but failed to yield measurable amounts 
of protein. 
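
The fold improvements quoted above for Lcn15 and MICA follow directly from the reported volumetric yields, as the sketch below shows. The numbers are the maximum yields stated in the text; the dictionary layout and function name are illustrative only.

```python
# (Daedalus yield, comparison yield) in mg/l, taken from the text above.
YIELDS_MG_PER_L = {
    "Lcn15 vs periplasmic E. coli": (94.0, 1.0),
    "MICA vs in vitro refolding": (82.0, 1.1),
    "MICA vs baculovirus": (82.0, 7.0),
}


def fold_improvement(new_yield: float, old_yield: float) -> float:
    """Ratio of the new volumetric yield to the comparison yield."""
    return new_yield / old_yield


for label, (daedalus, other) in YIELDS_MG_PER_L.items():
    print(f"{label}: {fold_improvement(daedalus, other):.1f}-fold")
```

The baculovirus comparison is not stated explicitly in the text; it is included here only because both yields are given.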

Antibody scFv constructs recapitulate the binding
properties of the mature immunoglobulin (except
bivalent avidity). The 4E10 scFv was refolded from bacterial
inclusion bodies at yields up to 0.8 mg per liter of
culture (36); Daedalus expression, using either huScn or
human Igκ signal peptides, yielded up to 25 mg/l of 4E10
scFv (Table 1), a 31-fold improvement. Binding properties
of Daedalus-expressed and refolded 4E10 scFv were identical
in surface plasmon resonance interaction analyses
(data not shown). The heavy and light variable regions
of the 4E10 were joined by a thrombin-cleavable linker 
(sequence: LVPRGSGGGGLVPRGS), yielding pure 
monobodies after cleavage; Daedalus-expressed scFv 
cleaved as readily as refolded scFv, with thrombin added 
directly to culture supernatants after concentration and 
prior to purification. A second scFv, based on antibody 
5B6, was not purifiable from any conventional recombin- 
ant system and was available only as sequence informa- 
tion, as the parent hybridoma proved to be unstable. 
Unfortunately, the 5B6 scFv also failed to produce meas- 
urable yields in the Daedalus system. 
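
The thrombin-cleavable linker described above contains two copies of the canonical thrombin recognition motif (LVPR followed by GS, with cleavage after the arginine). The following sketch locates the cut positions in the linker sequence given in the text; the regular expression and function names are illustrative, and real cleavage efficiency depends on context that this simple pattern match does not capture.

```python
import re

# Linker sequence as given in the text.
LINKER = "LVPRGSGGGGLVPRGS"

# Thrombin cuts after the Arg in LVPR when it is followed by GS.
THROMBIN_SITE = re.compile(r"LVPR(?=GS)")


def thrombin_cut_positions(seq: str) -> list:
    """0-based indices of the residue that follows each cleaved bond."""
    return [match.end() for match in THROMBIN_SITE.finditer(seq)]


def digest(seq: str) -> list:
    """Fragments produced by complete cleavage at every matched site."""
    cuts = [0] + thrombin_cut_positions(seq) + [len(seq)]
    return [seq[start:end] for start, end in zip(cuts, cuts[1:])]


print(thrombin_cut_positions(LINKER))
print(digest(LINKER))
```

Complete digestion at both sites would release the short internal GSGGGGLVPR segment and leave only a few linker residues on each variable domain.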

Expression of functional, endotoxin-free cytokines

Recombinant cytokines have become widely used in the 
treatment of cancer (37), primary immunodeficiency 
diseases (38), anemia and in conditioning patients 
receiving hematopoietic stem cell transplants (HSCTs) 
(39). However, recombinant cytokine therapeutics gener- 
ally display limited plasma half-lives due to their innate 
instability and rapid in vivo clearance, at least partially 
accounted for by a lack of glycosylation, which can be 
overcome by expression with native glycans, improving 
efficacy and stability (40,41). 

Therefore, we also evaluated the capacity of the 
Daedalus system to produce a panel of candidate cyto- 
kines and assessed whether these recombinant proteins 
exhibited normal levels of glycosylation and activity. We 
evaluated human interleukin-7 (IL-7), human leukemia inhibitory
factor (LIF), canine interleukin-3 (IL-3) and



e143 Nucleic Acids Research, 2011, Vol. 39, No. 21






canine granulocyte colony-stimulating factor (G-CSF). 
IL-7 is essential for lymphocyte development, proliferation
and activity, and is being studied therapeutically for
immune reconstitution following stem cell transplantation
and chemotherapy, as a vaccine adjuvant, and, more
recently, to treat sepsis (42). LIF is a multifunctional
growth factor that acts on several cell types. It is heav- 
ily glycosylated (LIF has seven putative sites) and it 
has been suggested that the glycosylation pattern, at 
least of rat LIF, is essential for its function (43). LIF 
has more recently become an important cytokine 
for maintaining embryonic stem cells (ESCs) in an undif- 
ferentiated state for their use in regenerative therapies (44) 
as well as for developing induced pluripotent stem 
cells (iPSCs) (45). Canine orthologs of IL-3 and G-CSF 
have been useful for studying HSCT in
a canine model (46). Canine IL-3 displays low sequence
identity to orthologs from other species (47),
necessitating the use of species-matched reagents in the 
canine model. 

Coding sequences for expression constructs for all four 
cytokines were synthesized, with codon utilization 
optimized for human expression, and inserted into the 
UCOE0.7-containing vector; IL-7, LIF and IL-3 were expressed
with their native signal peptides. G-CSF is endogenously
expressed as a pro-peptide that is further
processed to its mature form; for Daedalus expression,
the coding sequence for the mature protein was placed
downstream of the IL-3 signal peptide sequence, which
scored higher than the native signal peptide by the
SignalP 3.0 algorithm (48). All four proteins were successfully
expressed and purified with substantial yields
(Table 1) and were glycosylated, as seen by a shift in molecular
weight by SDS-PAGE (Supplementary Figure S5D) and by
PNGase treatment of IL-7 and LIF (Supplementary
Figure S5E). All four proteins were also tested for endotoxin
levels, which averaged <10 EU/mg, and IL-7 and
LIF were tested for activity. IL-7 is known to promote
B-cell development in culture and was tested using 
mouse bone marrow cells (49). The IL-7 produced using 
the Daedalus system performed identically to commercial 
IL-7 (Supplementary Figure S4A) in its ability to promote 
and maintain B-cell (B220+IgM+) development of mouse 
bone marrow cells in culture. The recombinant LIF was 
also functionally tested in its ability to maintain mouse ES 
cells in an undifferentiated state, as measured by mainten- 
ance of the cell-surface marker SSEA1 in culture 
(Supplementary Figure S4B) (50); there was no difference 
compared to commercial LIF. 

Expression of cytoplasmic proteins 

One advantage of the Daedalus system is the ease with 
which secreted proteins can be purified from the 
serum-free supernatants of transduced cells. To determine 
whether the Daedalus system could be adapted to 
cytoplasmically targeted proteins, the expression of a cyto- 
plasmic protein engineered with a heterologous signal 
peptide (from huScn) was tested for efficient secretion 
into media supernatants. Caspase-recruitment domain 
(CARD)-membrane-associated guanylate kinase protein
1 (CARMA1) is essential for lymphocyte activation and
immune function (51). The N-terminal CARD domain of 
CARMA1 (CARMA-CARD) interacts with that of B-cell 
lymphoma 10 (Bcl-10) and is essential for lymphocyte activation
via nuclear factor-κB (NF-κB) (52). CARMA-
CARD was successfully expressed as a cytoplasmically
targeted construct in E. coli (Bandanarayake et al., manuscript
in preparation) using a combination His6/maltose
binding protein purification tag, fused to CARMA- 
CARD through a tobacco etch virus (TEV) protease rec- 
ognition sequence. Typical yields of CARMA-CARD 
after tag cleavage and purification from E. coli were 
~10 mg/l (Supplementary Figure S6B). Although expression was
successful, this recombinant form failed to crystallize
despite exhaustive screening. For testing Daedalus
expressibility, the huScn signal peptide sequence was 
fused in-frame with the coding sequence of human 
CARMA-CARD (Asp2-Thr104) and the fusion was
inserted into the pCVL-UCOE0.7-SFFV-IRES-GFP 
vector backbone. A 100-ml culture of 293-F cells, 
transduced with recombinant lentivirus produced as 
described above, yielded 2.5 mg (25 mg/l) of protein after
final SEC purification (~95% pure by PAGE).
Interestingly, SEC analysis of the Daedalus-expressed
CARMA-CARD protein showed a mixture of monomeric
and dimeric forms (Supplementary Figure S6A), whereas 
the SEC analysis of bacterially expressed protein had 
shown only the monomeric form (Supplementary Figure 
S6B). The Daedalus CARMA-CARD homodimer was 
isolated by preparative SEC and found to be a stable, 
interchain disulfide-linked species as shown by compara- 
tive reduced/non-reduced SDS-PAGE analysis 
(Supplementary Figure S6C). Another CARD domain 
protein, NOD1, also homodimerizes by forming a
strand-swapped dimer, stabilized by an adventitious 
interchain disulfide when purified (53), suggesting that 
the Daedalus system allows CARMA-CARD to access a 
folding pathway not available in the cytoplasm of 
bacteria. While possibly not physiological, the 
CARMA-CARD interchain disulfide may also represent 
an adventitious linkage, potentially in the context of an analogous
strand-swapped dimer. Preliminary crystals of the
Daedalus-expressed CARMA-CARD dimer were 
obtained, but require further optimization. 
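
The volumetric yield quoted above (2.5 mg from a 100-ml culture, reported as 25 mg/l) is a simple unit conversion. A minimal sketch of that arithmetic follows; the function name is hypothetical and is not part of the original methods.

```python
def volumetric_yield_mg_per_l(mass_mg: float, culture_volume_ml: float) -> float:
    """Convert a purified protein mass (mg) from a culture of the given
    volume (ml) into a volumetric yield in mg per litre.

    Hypothetical helper for illustration only.
    """
    if culture_volume_ml <= 0:
        raise ValueError("culture volume must be positive")
    # Multiply by 1000 ml/l first, then divide by the culture volume in ml.
    return mass_mg * 1000.0 / culture_volume_ml


if __name__ == "__main__":
    # Values taken from the text: 2.5 mg purified from a 100-ml culture.
    print(volumetric_yield_mg_per_l(2.5, 100.0))
```

The same conversion allows yields from cultures of different volumes to be compared on a per-litre basis, for example against the ~10 mg/l obtained from E. coli.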



DISCUSSION 

Lipocalin 2 (also known as Siderocalin, NGAL or 24p3) is
a secreted, disulfide-containing, bacteriostatic glycoprotein
of the innate immune system that interferes with
siderophore-mediated iron uptake in bacteria (54). huScn has
been successfully expressed recombinantly in both bacterial
(<15 mg/l) and baculovirus (<5 mg/l) expression
systems in order to support successful crystallographic 
analyses (55,56). However, murine Siderocalin (muScn), 
despite being successfully expressed in analogous bacterial 
expression systems (yielding <15mg/l), proved refractory 
to crystallization. Since the presence or absence of 
N-glycans can affect crystallizability, an alternate expres- 
sion system was sought that could produce readily 
purifiable, natively glycosylated muScn at the multi-milligram
levels needed to support crystallization trials.
Using the Daedalus system, muScn was expressed at 
decigram levels and purified (in a single SEC step) to 
crystallizability within 18 days. Daedalus expression 
levels for muScn (~37mg/l) and huScn (~46mg/l) dra- 
matically exceeded yields from alternate systems. 

One dozen additional expression targets were tested, 
including additional lipocalins, truncated cell-surface 
glycoproteins, an antibody scFv construct, cytokines and 
cytoplasmic CARMA-CARD; 10 of these 12 targets were 
successfully expressed at yields up to 100 mg/l,
demonstrating the wide potential utility and high potential 
expression levels of the Daedalus platform, even when 
limited to conventional shaking-flask cell densities. The 
293-F secretion system also ensured that only properly 
folded and stable proteins were exported (57), adding an 
additional internal quality control that could also be 
applied to cytoplasmically targeted proteins by fusion 
with heterologous signal peptide sequences (huScn). 
Avoidance of bacteria during protein expression also 
yields recombinant protein free of contaminating endo- 
toxins without additional purification steps. 

Due to the packing limitations of lentiviruses, the 
maximum construct that can be expressed in the 
UCOE0.7 vector is ~2.0 kb, corresponding to an Mr of about
70 kDa. Further optimization of the vector may increase
this limit, and larger inserts may be packaged by deletion
of the IRES-GFP cassette, which, although useful for
rapid selection, is not necessary for viral or protein production.
This would increase the insert size by ~1 kb,
which would enable the production of, for instance,
full-length antibody chains. 
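
The relationship between insert size and protein mass stated above can be checked with a rough back-of-envelope calculation. The sketch below assumes that the whole insert is coding sequence, that there are three nucleotides per codon, and that the average residue mass is about 110 Da (a common rule of thumb, not a value from this work); glycosylation is ignored. The function and constant names are hypothetical.

```python
# Assumed average mass of one amino acid residue in a polypeptide, in daltons.
# This is a rule-of-thumb value, not a figure from the paper.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(insert_kb: float) -> float:
    """Estimate the polypeptide mass (kDa) encoded by an insert of the given
    size (kb), assuming the entire insert is coding sequence."""
    if insert_kb < 0:
        raise ValueError("insert size must be non-negative")
    nucleotides = round(insert_kb * 1000)
    codons = nucleotides // 3  # whole codons only
    return codons * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # ~2.0 kb limit of the UCOE0.7 vector, and ~3.0 kb after removing IRES-GFP.
    for size_kb in (2.0, 3.0):
        print(size_kb, "kb ->", approx_mass_kda(size_kb), "kDa (approximate)")
```

Under these assumptions a ~2.0-kb insert corresponds to a polypeptide in the low-70-kDa range, in line with the limit of about 70 kDa stated above, and an additional ~1 kb raises the estimate to roughly 110 kDa.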

In summary, we have developed a rapid and scalable 
expression system in 293-F cells by stabilizing the expres- 
sion of recombinant proteins using a lentiviral delivery 
vector. A key element of this invention was the incorpor- 
ation of a minimized UCOE to prevent genomic silencing 
of integrated expression cassettes. While previous studies 
have utilized large UCOE elements (up to 8kb) within 
plasmid constructs to obtain high expression clones in 
mammalian cells (58,59), the Daedalus system exhibits a 
number of significant advantages including increased 
speed, no requirement for producer cell selection or 
cloning, and demonstrated capacity to generate a broad 
range of functional proteins. The Daedalus system takes 
advantage of both the more efficient initial transduction 
rates in 293-F cells and viral elements that permit sus- 
tained high-level protein expression without a requirement 
for cell cloning. The Daedalus platform is simple and 
robust enough to enable academic labs to rapidly 
produce recalcitrant proteins for research and may be 
adaptable enough for therapeutic manufacturing and 
other biotechnology applications. 